Skip to content

spec: voice input hotkey picker — Caps Lock, F-keys, custom (#9869)#10127

Open
lonexreb wants to merge 5 commits intowarpdotdev:masterfrom
lonexreb:spec/9869-voice-hotkey-picker
Open

spec: voice input hotkey picker — Caps Lock, F-keys, custom (#9869)#10127
lonexreb wants to merge 5 commits intowarpdotdev:masterfrom
lonexreb:spec/9869-voice-hotkey-picker

Conversation

@lonexreb
Copy link
Copy Markdown
Contributor

@lonexreb lonexreb commented May 5, 2026

Summary

Spec for issue #9869 — the voice input hotkey picker in Settings → Agents → Voice offers a fixed list of 10 modifier/function keys, missing Caps Lock and individual F-keys (F1–F12), and providing no escape hatch for any other key.

Investigation

  • Setting is VoiceInputToggleKey enum at app/src/settings/ai.rs:110. Closed enum with hand-written display_name, to_key_code, keystroke, and tooltip_message.
  • Underlying KeyCode at crates/warpui_core/src/platform/keyboard.rs already exposes CapsLock and F1F35; the enum is artificially narrower than what the platform layer can identify.
  • Persistence is SyncToCloud::Never (intentional — same person uses different keyboards on different devices). Preserved.
  • Action dispatch at block_list_element.rs:3072 and alt_screen_element.rs:620 already compares incoming key_code: KeyCode to the configured one, so new variants don't require dispatch-layer changes.

What's in the spec

product.md — 8 testable behavior invariants (B1–B8), 6 acceptance criteria (A1–A6), explicit non-goals (chord hotkeys, per-tab/profile, cloud sync), and 5 risks with concrete TECH decisions to make.

tech.md — picks the data-model approach with explicit reasoning, names every file and line range touched, and lays out 6 unit tests + 4 modal tests + 3 integration tests.

Architectural choices

  • Custom(KeyCode) variant on the existing enum, not a separate override field. Single source of truth for "what key fires voice input"; avoids a "two settings disagree" failure mode.
  • Sectioned dropdown ("Modifier keys" / "Function keys" / "Other") to keep 22 entries scannable. Existing modifiers stay at the top so existing users see no reorder.
  • Press-to-capture modal for the "Custom key…" entry, with a reject-list (app-quit shortcut, Enter/Escape/Tab/Backspace/Space) and an >800ms held-modifier carve-out so users can still bind a bare modifier through this path.
  • macOS Caps Lock uses press-to-toggle semantics (one tap on, next tap off) because macOS Caps Lock fires only on toggle events, not press/release. Discord and Krisp both take this approach.
  • Validation reset for hand-edited TOML with unbindable values, with a one-time toast.

Test plan

  • Unit T1–T6 (TOML round-trip, display name, key mapping, toggle-mode, validation reset)
  • Modal T7–T10 (capture flow, reject-list, Escape, held-modifier carve-out)
  • Integration IT1–IT3 (full modal flow, Caps Lock toggle on macOS, F-key hold-to-talk)
  • Manual: existing user with voice_input_toggle_key = "alt_left" in TOML sees no behavior change after upgrade
  • Manual: hand-edited voice_input_toggle_key = "enter" shows toast and resets to None

Open questions for maintainer review

  1. Does the SettingsValue derive macro support enum variants with payloads (Custom(KeyCode)), or do we need a manual Serialize/Deserialize impl?
  2. Is there an existing DropdownItem::heading() constructor for sectioning, or should this PR add one?
  3. Confirm the macOS Caps Lock press-to-toggle decision (vs documenting the difference and not handling).
  4. Should the press-to-capture modal expose the captured KeyCode spelling (debug aid) or only the friendly display name?

Closes (spec-only) #9869 — implementation PR will follow once spec direction is confirmed.

…ev#9869)

Adds product.md + tech.md for issue warpdotdev#9869: the voice input hotkey
picker in Settings > Agents > Voice exposes a fixed list of 10
modifier/function keys, missing Caps Lock and individual F-keys
(F1-F12) — the two most-requested options for push-to-talk in real
ergonomic setups — and offering no escape hatch for any other key.

Investigation:
- Setting is VoiceInputToggleKey enum at app/src/settings/ai.rs:110.
  Closed enum with hand-written display_name, to_key_code, keystroke,
  and tooltip_message methods.
- Underlying KeyCode at crates/warpui_core/src/platform/keyboard.rs
  already supports CapsLock and F1-F35; the enum is artificially
  narrower than the platform layer can handle.
- Persistence is SyncToCloud::Never (intentional, per-device
  keyboards) — preserved.
- Action dispatch at block_list_element.rs:3072 and
  alt_screen_element.rs:620 already speaks KeyCode, so new variants
  don't require dispatch-layer changes.

Spec proposes:
- Extend the enum with CapsLock, F1-F12, and a Custom(KeyCode)
  variant (single source of truth, justified against an
  override-field alternative).
- Sectioned dropdown ("Modifier keys" / "Function keys" / "Other")
  to keep 22 entries scannable; existing modifiers stay at the top
  so existing users see no reorder.
- Press-to-capture modal for "Custom key..." with a B5 reject-list
  (app-quit shortcut, Enter/Escape/Tab/Backspace/Space) and an
  >800ms held-modifier carve-out.
- Caps Lock on macOS uses press-to-toggle semantics (one tap on,
  next tap off) because macOS Caps Lock fires only on toggle
  events, not press/release. Discord and Krisp both take this
  approach.
- Validation reset for hand-edited TOML with unbindable values,
  with a one-time toast.

Test plan covers ten unit cases (TOML round-trip, display, key
mapping, toggle-mode, validation reset) and three integration
tests (modal capture flow, Caps Lock toggle, F-key hold-to-talk).

Four open questions for maintainer review on SettingsValue derive
support for payload variants, dropdown section heading
constructors, the macOS Caps Lock decision, and KeyCode display
exposure.
@cla-bot cla-bot Bot added the cla-signed label May 5, 2026
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 5, 2026

@lonexreb

I'm starting a first review of this spec-only pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@github-actions github-actions Bot added the external-contributor Indicates that a PR has been opened by someone outside the Warp team. label May 5, 2026
Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This spec adds product and technical direction for expanding the voice input hotkey picker with Caps Lock, F-keys, and a custom press-to-capture path. The scope aligns with GH-9869, but several requirements currently contradict each other or omit implementation details needed to avoid a broken follow-up implementation.

Concerns

  • Product and tech specs disagree on whether macOS Caps Lock is hold-to-talk or tap-to-toggle.
  • The chosen Custom(KeyCode) model needs a concrete replacement for the current finite enum iteration and a settings-file serialization plan that works with KeyCode.
  • The invalid TOML reset/toast design does not explain how validation sees values that fail deserialization.
  • Shortcut rejection needs full keystroke/modifier semantics, not just a KeyCode reject list.

Security

  • Telemetry should explicitly exclude raw capture attempts and rejected/cancelled keypresses, recording only the final saved hotkey value.

Verdict

Found: 1 critical, 4 important, 1 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH9869/product.md Outdated
### B1 — Caps Lock as a first-class option

Caps Lock appears in the dropdown on all platforms (macOS, Windows,
Linux). When selected, holding Caps Lock activates voice input and
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 [CRITICAL] This says Caps Lock is hold/release on all platforms, but tech.md requires macOS press-to-toggle; make the product behavior and acceptance criteria explicitly platform-specific so implementation has one source of truth.

Comment thread specs/GH9869/tech.md Outdated
CapsLock, // NEW
F1, F2, F3, F4, F5, F6, // NEW
F7, F8, F9, F10, F11, F12, // NEW
Custom(KeyCode), // NEW
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Adding Custom(KeyCode) to the existing enum breaks the current EnumIter/all_possible_values() dropdown flow and requires a SettingsValue path for KeyCode; specify an explicit predefined-option list plus the manual Custom settings-file handling.

Comment thread specs/GH9869/tech.md Outdated
```
For `Custom`:
```toml
voice_input_toggle_key = { custom = "F19" }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] The custom TOML shape conflicts with the earlier { kind = "custom", key = "f19" } fallback and product.md's lowercase { custom = "f19" }; choose one canonical format and make the tests assert it.

Comment thread specs/GH9869/tech.md Outdated
`AISettingsPageAction::SetVoiceInputToggleKey(VoiceInputToggleKey::Custom(key_code))`.
- On Cancel or Escape, closes without dispatch.
- The modal's reject-list (B5) is constructed at open time from:
- The currently-bound app-quit shortcut (read from key bindings).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] The app-quit shortcut is usually a chord, but this modal captures a KeyCode; specify tracking modifiers and comparing a full Keystroke so Cmd+Q is rejected without banning bare Q.

Comment thread specs/GH9869/tech.md Outdated
deserializer falls back to `Default` today. We need to additionally
fire a one-time toast.

Implementation: add a `validate()` method on `VoiceInputToggleKey`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This validation hook cannot detect voice_input_toggle_key = "enter" if deserialization has already failed/defaulted; specify where the raw parse failure is surfaced so A6's one-time toast actually fires.

Comment thread specs/GH9869/product.md

### B8 — Telemetry: which key was bound

When the user changes the voice input hotkey, fire the existing
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 [SUGGESTION] [SECURITY] Clarify telemetry fires only after Confirm/save and records only the final enum/KeyCode label, not raw key events, rejected keys, or cancelled captures.

Bot review (warpdotdev#10127) identified 1 critical + 4 important + 1 security
suggestion concerns:

- Critical: product and tech specs disagreed on whether macOS Caps
  Lock is hold-to-talk or tap-to-toggle.
- Important: Custom(KeyCode) breaks the existing EnumIter /
  all_possible_values() flow with no concrete replacement.
- Important: TOML format for Custom variant was specified two ways
  (`{ kind = "custom", key = "f19" }` and `{ custom = "F19" }`).
- Important: app-quit reject-list compared bare KeyCode, not
  Keystroke — would either ban bare Q or fail to reject Cmd+Q.
- Important: validation hook can never see TOML values that fail
  parsing, so A6's toast would never fire.
- Suggestion: telemetry must explicitly exclude raw capture
  attempts and rejected/cancelled keypresses.

Fixes:

- B1 split into platform-specific semantics: macOS press-to-toggle
  (one tap on, next tap off — matches macOS Caps Lock event model
  and Discord/Krisp UX), Windows/Linux hold-to-talk. A4 acceptance
  criterion split into A4-mac and A4-win-lin.
- Added "Replacing the EnumIter / all_possible_values() flow"
  section with a concrete predefined_options() function that returns
  the curated unit variants only. Custom(KeyCode) is constructed at
  capture time, not iterated. EnumIter derive is dropped.
- Added "Settings serialization" section justifying the hand-written
  Serialize/Deserialize impl over a one-off macro extension.
- Canonicalized the TOML format to `{ custom = "f19" }` with
  KeyCode payload as snake_case lowercased Serde-default. Tests
  assert exact strings.
- Reject-list comparison now uses full Keystroke (key + modifiers)
  for app-quit and the existing voice-input shortcut. Static
  structural rejects (Enter/Escape/Tab/Backspace/Space) remain
  bare-key. The bare-modifier rule is preserved via the >800ms held-
  modifier carve-out.
- Settings-reset (A6) split into two paths: raw parse failure
  (intercepted in the hand-written Deserialize impl, surfaced via a
  side-channel Mutex<Option<String>>) and parsed-but-unbindable
  (post-parse validate() check). Both fire the same toast.
- Added explicit telemetry privacy guardrails to B8: only the final
  saved value is emitted, never raw captures, rejected keys, or
  cancelled captures; respects global opt-out.

Each correction includes a "Correction (review warpdotdev#10127)" callout so
reviewers can cross-reference the bot's concerns.
@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 5, 2026

Pushed corrections addressing all 6 bot review concerns (1 critical + 4 important + 1 security):

  • Critical: macOS Caps Lock contradiction. B1 split into platform-specific semantics: macOS uses press-to-toggle (matches the macOS Caps Lock event model and Discord/Krisp UX), Windows/Linux use hold-to-talk. A4 split into A4-mac and A4-win-lin.
  • Important: Custom(KeyCode) breaks EnumIter/all_possible_values(). Added a "Replacing the EnumIter flow" section with a concrete predefined_options() returning curated unit variants. Custom is constructed at capture time, not iterated. EnumIter derive is dropped.
  • Important: serialization plan. Added "Settings serialization" section recommending a hand-written Serialize/Deserialize impl over a one-off implement_setting_for_enum! extension.
  • Important: TOML format conflict. Canonicalized to { custom = "f19" } with the KeyCode payload as snake_case lowercased. Tests assert exact strings.
  • Important: app-quit reject-list. Now compares full Keystroke (key + modifiers), so Cmd+Q is rejected without banning bare Q. Static structural rejects (Enter/Escape/Tab/Backspace/Space) remain bare-key.
  • Important: validation flow. A6 split into Path 1 (raw parse failure intercepted in the hand-written Deserialize impl via a side-channel Mutex<Option<String>>) and Path 2 (parsed-but-unbindable post-parse validate()). Both fire the same toast.
  • Security: telemetry guardrails. B8 now explicitly limits telemetry to the final saved value — never raw captures, rejected keys, or cancelled captures. Respects global opt-out.

Each correction includes a "Correction (review #10127)" callout in the spec for cross-reference. Ready for re-review.

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 5, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This spec covers the requested Caps Lock, F-key, and custom-key voice input picker behavior with product and technical detail. The overall scope aligns with the linked issue, and the privacy treatment for captured keys is explicit.

Concerns

  • Acceptance criteria still require hold/release semantics for every bound key, contradicting the macOS Caps Lock tap-to-toggle behavior defined earlier.
  • The technical spec's voice-input self-conflict rule would prevent users from changing an existing custom hotkey.
  • The proposed invalid-TOML recovery path relies on a field deserializer mutating the containing settings object, which is not a feasible Serde design as written.
  • Product and tech specs disagree on whether invalid/unbindable settings should leave TOML unchanged or rewrite it.

Verdict

Found: 1 critical, 3 important, 1 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH9869/product.md Outdated
Comment on lines +189 to +191
A4. Holding the bound key activates voice input; releasing
deactivates it. Pixel-equivalent to today's behavior for the
existing modifier keys.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] A4 still requires hold/release for every bound key, contradicting B1's macOS Caps Lock tap-to-toggle split. Split A4 here too or exclude macOS Caps Lock so acceptance tests do not implement both semantics.

Comment thread specs/GH9869/tech.md Outdated
Comment on lines +222 to +225
- **Currently-bound voice-input shortcut:** allow re-confirming
the same key (so the user can re-open and confirm without
error), but reject if the user picks a different value than
the existing one for any reason. Same `Keystroke` comparison.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 [CRITICAL] This rejects any different custom key, so the modal cannot actually change an existing voice hotkey. The voice-input self-conflict rule should reject only recursive activation while capture is open or the exact conflicting shortcut, not every value different from the current setting.

Comment thread specs/GH9869/tech.md Outdated
Comment on lines +323 to +326
2. **On parse failure, return a special `Default` value tagged with
the original raw string** instead of erroring out. The tag is
stored in a side-channel: a `parking_lot::Mutex<Option<String>>`
on `AISettings` named `voice_input_toggle_key_raw_parse_failure`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] A field-level Deserialize impl cannot write into a side-channel on the containing AISettings instance while that instance is being constructed. Specify a feasible mechanism, such as manual AISettings deserialization that captures the raw value or a wrapper type returned from parsing.

Comment thread specs/GH9869/tech.md Outdated
Comment on lines +345 to +346
- Reset the in-memory value to `None`.
- Re-write the TOML on disk with the corrected value.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This contradicts product A6, which says corrupted TOML is left unchanged. Pick one persistence behavior for invalid/unbindable values and align product, tech, tests, and toast wording.

Comment thread specs/GH9869/tech.md Outdated
Comment on lines +415 to +417
1. Does the `SettingsValue` derive macro support enum variants with
payloads (`Custom(KeyCode)`), or do we need a manual
`Serialize`/`Deserialize` impl?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 [SUGGESTION] This open question is already answered in the spec: the macro does not support payload variants and the recommendation is manual serialization. Remove it or rephrase it as a maintainer decision needing confirmation.

Round-2 bot review identified 4 remaining concerns:

- A4 acceptance criterion still required hold/release semantics
  for every bound key, contradicting B1's macOS Caps Lock tap-to-
  toggle that I added in round 1.
- The reject-list rule for the currently-bound voice-input
  shortcut would have prevented users from changing their existing
  custom hotkey — the literal opposite of the modal's purpose.
- Path 1 of the A6 reset flow proposed mutating a side-channel
  Mutex on the containing AISettings struct from inside the field
  Deserialize impl. That isn't feasible Serde — a field
  deserializer doesn't have a reference to its container.
- Product and tech specs still disagreed on whether invalid TOML
  is left unchanged or rewritten.

Fixes:

- A4 split into A4-win-lin (hold-to-talk) and A4-mac (Caps Lock
  tap-to-toggle, all other keys hold-to-talk). The split mirrors
  B1's existing platform breakdown.
- Reject-list: removed the "reject voice-input self-conflict"
  rule. Same-key re-confirm is handled by accepting the capture
  and short-circuiting the dispatch when the captured KeyCode
  equals the current one. The dispatch handler at
  block_list_element.rs:3072 only fires voice activation, never
  re-binds, so there is no infinite-loop risk.
- A6 reset path rewritten with a feasible Serde design: the
  storage type is a #[serde(untagged)] wrapper enum
  VoiceInputTomlValue { Valid(VoiceInputToggleKey),
  Invalid(toml::Value) } that captures both successful and failed
  parses without erroring out. A single post-init pass on
  AISettings::initialize drains the Invalid branch, logs, rewrites
  the TOML on disk, and fires the toast. No side-channel mutation
  from inside a deserializer.
- Path 2 (parsed-but-unbindable) folded into the same routine,
  guaranteeing A6's toast fires once for any combination of TOML-
  parse-failure and post-parse-validation-failure.
- product.md A6 updated to "TOML is rewritten" so it matches
  tech.md.

Each correction includes a "Correction (re-review warpdotdev#10127)" callout.
@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 5, 2026

Round-2 corrections pushed (commit 7387e88):

  • A4 / Caps Lock contradiction: A4 split into A4-win-lin (hold-to-talk for all keys) and A4-mac (Caps Lock tap-to-toggle, all other keys hold-to-talk). Mirrors B1's existing platform breakdown.
  • Self-conflict rule preventing custom-hotkey changes: removed the "reject voice-input self-conflict" rule from the modal reject-list. Same-key re-confirm is handled by accepting the capture and short-circuiting the dispatch when the captured KeyCode equals the current one. The dispatch handler at block_list_element.rs:3072 only fires voice activation, never re-binds — no infinite-loop risk.
  • Field deserializer feasibility: replaced the side-channel-Mutex-from-inside-Deserialize design (not feasible Serde) with a #[serde(untagged)] wrapper enum VoiceInputTomlValue { Valid(...), Invalid(toml::Value) }. Standard Serde, no container-mutation from the field deserializer. A single post-init pass on AISettings::initialize drains the Invalid branch, logs, rewrites the TOML, fires the toast.
  • TOML rewrite vs. left-unchanged: product.md A6 updated to "TOML is rewritten" so the toast fires once, not on every launch. Now matches tech.md.

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 5, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

Reviewed the spec-only PR for GH-9869. The product and technical specs cover the intended Caps Lock, F-key, and custom-key expansion, including compatibility, platform-specific Caps Lock behavior, persistence, tests, and privacy guardrails.

Concerns

  • The product and technical specs still disagree on whether the capture flow rejects the currently-bound voice-input shortcut.
  • The technical spec does not define how a persisted Custom(KeyCode) is displayed/selected in the dropdown once that value is outside predefined_options().
  • The modal display requirements conflict with an unresolved open question about whether to show protocol-level KeyCode spelling.

Verdict

Found: 0 critical, 2 important, 1 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH9869/product.md Outdated
The press-to-capture flow rejects (with an inline error message and
no save) any key that would brick the user's keyboard:
- The currently-bound app-quit shortcut.
- The currently-bound voice-input shortcut itself (no infinite loop).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This contradicts tech.md, which says the existing voice-input value is not rejected and same-key capture is accepted as a no-op; pick one behavior so the product contract and modal implementation/tests align.

Comment thread specs/GH9869/tech.md
Comment on lines +95 to +99
`Custom(KeyCode)` is **not** part of `predefined_options()` — it is
constructed by the press-to-capture modal at runtime. The dropdown
shows a synthetic "Custom key…" entry that opens the modal; once a
key is captured, the resulting `Custom(KeyCode)` is set as the
current value but does not appear in the predefined list.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] If a captured Custom(KeyCode) becomes the current value but is absent from the item list, the spec does not define how the dropdown trigger, selected state, or reload path render it; specify whether to inject a selected Custom: F19 row, special-case the trigger label, or use another supported dropdown pattern.

Comment thread specs/GH9869/product.md
Comment on lines +107 to +108
- Confirms the captured key by name and `KeyCode` (e.g.
"Selected: F19 (KeyCode::F19)").
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 [SUGGESTION] This requires showing KeyCode in the modal, while B7 and tech.md's open question lean toward friendly labels; resolve the modal copy so implementers do not ship different UX.

…ec (round 3)

Round-3 bot review identified 2 important + 1 suggestion:

- Product B5 still listed "the currently-bound voice-input
  shortcut itself" as a reject-list entry, contradicting the
  tech-spec round-2 fix that removed it (it would have prevented
  users from changing an existing custom hotkey).
- Tech spec did not specify how the dropdown displays the
  *currently selected* value when that value is a Custom(KeyCode)
  outside predefined_options(). Without spec, the
  set_selected_by_index(None) path would silently fall back to
  index 0 — a data-loss bug.
- Modal display open question still unresolved (whether to show
  protocol-level KeyCode spelling).

Fixes:

- B5 product reject-list updated to drop the voice-input self-
  conflict entry, matching tech.md round 2.
- Added "Displaying a persisted Custom(KeyCode) in the dropdown"
  section: virtual "Custom: <friendly-name>" row prepended when
  the current value is a Custom not in predefined_options();
  re-opens the modal pre-populated when clicked. No data loss on
  render.
- Resolved open question 4: friendly display name primary,
  protocol-level KeyCode in hover tooltip only. Same friendly
  name on dropdown row and modal display.

Each correction includes a "Correction (re-review warpdotdev#10127)" callout.
@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 5, 2026

Round-3 corrections pushed (commit e8eca85):

  • Self-conflict in B5: removed "currently-bound voice-input shortcut" from the product reject-list. Tech round-2 already removed it; product now matches.
  • Persisted Custom(KeyCode) display: new "Displaying a persisted Custom(KeyCode) in the dropdown" section. Virtual "Custom: " row prepended when the current value isn't in predefined_options(); clicking re-opens the modal pre-populated. No silent fallback to index 0 (which would have caused data loss).
  • Open question 4 resolved: friendly display name primary; protocol-level KeyCode in hover tooltip only. Same string on dropdown row and modal.

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 5, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

Reviewed the product and technical specs for expanding the voice input hotkey picker to Caps Lock, F-keys, and custom single-key capture. The scope matches GH-9869 and the specs cover the main product behavior, persistence, UI, and test plan.

Concerns

  • The product and technical specs still disagree about whether the captured KeyCode is shown inline in the modal or only in a hover tooltip.
  • The invalid-TOML wrapper design is not reconciled with the generated settings API that later examples and tests still rely on.
  • One open question is stale after the serialization section already chooses the manual implementation path.

Verdict

Found: 0 critical, 2 important, 1 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH9869/product.md
Comment on lines +107 to +108
- Confirms the captured key by name and `KeyCode` (e.g.
"Selected: F19 (KeyCode::F19)").
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] B3 requires the modal to show KeyCode inline, but tech.md resolves this to friendly name inline with KeyCode only in a hover tooltip. Pick one product-visible behavior so implementation and tests don't diverge.

Comment thread specs/GH9869/tech.md Outdated
Comment on lines +389 to +391
// AISettings::voice_input_toggle_key is internally
// VoiceInputTomlValue, with a public accessor that returns
// VoiceInputToggleKey (mapping Invalid -> None on read).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This wrapper changes the setting field's internal type, but later examples and tests still use the generated Setting API as if voice_input_toggle_key.value() returns VoiceInputToggleKey and set_value() accepts VoiceInputToggleKey. Specify the exact setting type/Setting::Value integration or update the API expectations before implementation.

Comment thread specs/GH9869/tech.md Outdated
Comment on lines +490 to +492
1. Does the `SettingsValue` derive macro support enum variants with
payloads (`Custom(KeyCode)`), or do we need a manual
`Serialize`/`Deserialize` impl?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 [SUGGESTION] This question is stale now that the settings serialization section recommends manual Serialize/Deserialize; either remove it or reframe it as a maintainer confirmation of that decision.

- Modal body display reconciled: friendly name only, KeyCode in
  hover tooltip. Earlier draft had inline KeyCode spelling that
  contradicted the round-3 Open Question 4 resolution.
- Open Question 1 (SettingsValue derive macro) closed: committed
  to manual Serialize/Deserialize impl per round-2 Settings
  serialization section. Open Question 4 was already closed.
- VoiceInputTomlValue wrapper API reconciled: wrapper is an
  internal implementation detail. All external accessors
  (set_value, value(), tests, dropdown reads, dispatch) continue
  to see the unwrapped VoiceInputToggleKey. After post-init the
  field is always Valid(...), with the unwrap happening in the
  accessor. Tests don't see the wrapper.
@lonexreb
Copy link
Copy Markdown
Contributor Author

lonexreb commented May 6, 2026

Round-4 corrections (commit ccbec29):

  • Modal body now shows friendly name only with KeyCode in hover tooltip — matches the round-3 resolution of Open Question 4. Removed the contradictory Selected: F19 (KeyCode::F19) inline form.
  • Open Question 1 closed: committed to manual Serialize/Deserialize impl per round-2 Settings serialization section.
  • VoiceInputTomlValue wrapper now explicitly an internal implementation detail. All external accessors (set_value, value(), tests, dropdown reads, dispatch) continue to see the unwrapped VoiceInputToggleKey. After post-init the field is always Valid(...).
    /oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 6, 2026

@lonexreb

I'm re-reviewing this spec-only pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This spec covers the requested Caps Lock, F-key, and custom-key expansion for the voice input hotkey picker and includes product behavior, technical design, and validation coverage. The main direction is aligned with issue #9869, and the telemetry section includes appropriate privacy guardrails.

Concerns

  • Product and technical specs still disagree on whether the custom capture modal shows the protocol KeyCode inline or only in a tooltip.
  • The current-implementation section says dispatch needs no changes for new variants, but the same tech spec later requires dispatch changes for macOS Caps Lock toggle mode.
  • The corrupted-TOML reset design does not match the current settings framework's per-setting SettingsValue load path, so A6 is not implementable as written.
  • The custom KeyCode TOML wording should avoid implying that KeyCode's default Serde output is already snake_case.

Verdict

Found: 0 critical, 3 important, 1 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/GH9869/product.md
A new dropdown entry "Custom key…" opens a modal that prompts
"Press the key you want to use for voice input." The user presses
any single key. The modal:
- Confirms the captured key by name and `KeyCode` (e.g.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] B3 says the modal confirms the captured key by name and KeyCode inline, but tech.md later resolves that the KeyCode spelling appears only in a tooltip. Align the product behavior so implementation and tests do not disagree on user-facing copy.

Comment thread specs/GH9869/tech.md
and the long tail.
- **Action dispatch:** `block_list_element.rs:3072` and
`alt_screen_element.rs:620` compare incoming `key_code` to the
configured one. No code changes needed there for new variants —
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This says dispatch needs no code changes for new variants, but the spec later requires changing both dispatch handlers for macOS Caps Lock toggle mode. Narrow this statement to non-toggle keys or call out the Caps Lock exception.

Comment thread specs/GH9869/tech.md
```

For `Custom(KeyCode)` — TOML inline-table with one key, `custom`,
and the `KeyCode` value as its lowercased Serde-default
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 [SUGGESTION] KeyCode currently derives Serde without rename_all, so default Serde output is not snake_case. Reword this to say the manual VoiceInputToggleKey settings-file implementation defines the snake_case mapping, rather than implying KeyCode already provides it.

Comment thread specs/GH9869/tech.md
> while tech.md said it was rewritten. Resolved to **rewrite** so
> the toast fires once, not on every launch.

The TOML field uses a wrapper enum that captures both successful
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] The reset design assumes AISettings is deserialized as a struct field, but settings are loaded per-setting through Setting::read_from_preferences and SettingsValue::from_file_value, which returns None and inhibits writes on parse failure before a post-init wrapper can inspect raw TOML. Specify the actual interception point, such as a custom settings-value sentinel or a settings read-path change, so A6 is implementable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed external-contributor Indicates that a PR has been opened by someone outside the Warp team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant